
    Inferring meta-covariates in classification

    This paper develops an alternative method for gene selection that combines model-based clustering and binary classification. By averaging the covariates within the clusters obtained from model-based clustering, we define “meta-covariates” and use them to build a probit regression model, thereby selecting clusters of similarly behaving genes and aiding interpretation. This simultaneous learning task is accomplished by an EM algorithm that optimises a single likelihood function, which rewards good performance at both classification and clustering. We explore the performance of our methodology on a well-known leukaemia dataset and use the Gene Ontology to interpret our results.
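The averaging step described in this abstract can be illustrated with a minimal sketch. It assumes the cluster assignment of each gene is already known and fixed, whereas the paper learns clusters and classifier jointly with EM; the function name `meta_covariates` and the toy data are hypothetical and not taken from the paper.

```python
def meta_covariates(X, labels):
    """Average gene columns within each cluster (illustrative only).

    X      : list of samples, each a list of gene expression values
    labels : cluster id for each gene (column); assumed given here,
             not learned as in the paper's EM procedure
    Returns one row per sample with one mean per cluster,
    ordered by sorted cluster id.
    """
    clusters = sorted(set(labels))
    # Column indices belonging to each cluster
    members = {c: [j for j, lab in enumerate(labels) if lab == c]
               for c in clusters}
    return [
        [sum(row[j] for j in members[c]) / len(members[c]) for c in clusters]
        for row in X
    ]


# Toy data: 2 samples, 3 genes; genes 0 and 1 share a cluster.
X = [[1.0, 3.0, 10.0],
     [2.0, 4.0, 20.0]]
labels = [0, 0, 1]
M = meta_covariates(X, labels)
```

Each row of `M` would then serve as the input to a classifier such as a probit regression, with one coefficient per cluster rather than one per gene.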

    Random forest for gene selection and microarray data classification

    A random forest method has been selected to perform both gene selection and classification of the microarray data. In this embedded method, selecting the smallest possible sets of genes with the lowest error rates is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The biggest-subset option is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification performed on the selected subset of genes using random forest led to lower prediction error rates compared with the existing method and other similar available methods.

    Evolutionary Computation for Optimal Ensemble Classifier in Lymphoma Cancer Classification

    Owing to the development of DNA microarray technologies, it is possible to obtain the expression levels of thousands of genes at once. If we build an effective classification system from such data, we can predict the class of a new sample, that is, whether it comes from a normal subject or a patient. Many feature selection methods and classifiers can be used in such a system, but no single method is absolutely superior to the others for feature selection or classification. Ensemble classifiers have been used to yield improved performance in this situation, but it is almost impossible to evaluate all possible ensembles when many feature selection methods and classifiers are available. In this paper, we propose a genetic algorithm (GA) based method for searching for the optimal ensemble of feature-classifier pairs on a lymphoma cancer dataset. We have used two ensemble methods, and the GA finds the optimal ensemble very efficiently.

    Isometric Sliced Inverse Regression for Nonlinear Manifolds Learning

    Sliced inverse regression (SIR) was developed to find effective linear dimension-reduction directions for exploring the intrinsic structure of high-dimensional data. In this study, we present isometric SIR for nonlinear dimension reduction, which combines the SIR method with a geodesic distance approximation. First, the proposed method computes the isometric distance between data points; the resulting distance matrix is then sliced according to K-means clustering results, and the classical SIR algorithm is applied. We show that the isometric SIR (ISOSIR) can reveal the geometric structure of a nonlinear manifold dataset (e.g., the Swiss roll). We report and discuss this novel method in comparison to several existing dimension-reduction techniques for data visualization and classification problems. The results show that ISOSIR is a promising nonlinear feature extractor for classification applications.

    An Observational Overview of Solar Flares

    We present an overview of solar flares and associated phenomena, drawing upon a wide range of observational data primarily from the RHESSI era. Following an introductory discussion and overview of the status of observational capabilities, the article is split into topical sections which deal with different areas of flare phenomena (footpoints and ribbons, coronal sources, relationship to coronal mass ejections) and their interconnections. We also discuss flare soft X-ray spectroscopy and the energetics of the process. The emphasis is to describe the observations from multiple points of view, while bearing in mind the models that link them to each other and to theory. The present theoretical and observational understanding of solar flares is far from complete, so we conclude with a brief discussion of models, and a list of missing but important observations. Comment: This is an article for a monograph on the physics of solar flares, inspired by RHESSI observations. The individual articles are to appear in Space Science Reviews (2011).

    Statistical strategies for avoiding false discoveries in metabolomics and related experiments
